1. Introduction

Consider the linear regression model $Y = X\beta + \varepsilon$, where $Y$ is a random $n$-vector
of responses, $X$ is a known $n \times p$ matrix with linearly independent columns,
$\beta$ is an unknown parameter $p$-vector and $\varepsilon \sim N(0, \sigma^2 I_n)$, where $\sigma^2$ is an unknown
positive parameter. Let $\hat\beta$ denote the least squares estimator of $\beta$. Also, define
$\hat\sigma^2 = (Y - X\hat\beta)^T (Y - X\hat\beta)/(n-p)$.

Suppose that the parameter of interest is $\theta = a^T\beta$, where $a$ is a given $p$-vector
($a \neq 0$). We seek a $1-\alpha$ confidence interval for $\theta$. Define the quantile $t(m)$ by the
requirement that $P(-t(m) \le T \le t(m)) = 1-\alpha$ for $T \sim t_m$. Let $\hat\Theta$ denote $a^T\hat\beta$,
i.e. the least squares estimator of $\theta$. Also let $v_{11}$ denote the variance of $\hat\Theta$ divided
by $\sigma^2$. The usual $1-\alpha$ confidence interval for $\theta$ is

$$I = \left[\hat\Theta - t(m)\sqrt{v_{11}}\,\hat\sigma,\ \hat\Theta + t(m)\sqrt{v_{11}}\,\hat\sigma\right]$$

where $m = n - p$. Is this confidence interval admissible? The admissibility of a
confidence interval is a much more difficult concept than the admissibility of a point
estimator, since confidence intervals must satisfy a coverage probability constraint.
Also, admissibility of confidence intervals can be defined in either weak or strong
forms (Joshi, 1969, 1982).
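As an illustration only, the following Python sketch computes the quantile $t(m)$ and the usual interval $I$ from given values of $\hat\Theta$, $v_{11}$, $\hat\sigma$ and $m$. The function names and the numerical method (Simpson's rule combined with bisection, standard library only) are assumptions made for this sketch and are not taken from the paper.

```python
import math

def t_pdf(x, m):
    """Density of Student's t distribution with m degrees of freedom."""
    log_c = (math.lgamma((m + 1) / 2) - math.lgamma(m / 2)
             - 0.5 * math.log(m * math.pi))
    return math.exp(log_c - (m + 1) / 2 * math.log1p(x * x / m))

def central_prob(t, m):
    """P(-t <= T <= t) for T ~ t_m, by composite Simpson's rule on [0, t]."""
    if t <= 0:
        return 0.0
    n = 2 * max(100, math.ceil(100 * t))   # even panel count, step <= 0.005
    h = t / n
    total = t_pdf(0.0, m) + t_pdf(t, m)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, m)
    return 2 * total * h / 3               # symmetry: twice the mass on [0, t]

def t_quantile(m, alpha):
    """Solve P(-t <= T <= t) = 1 - alpha for t by bisection."""
    lo, hi = 0.0, 1.0
    while central_prob(hi, m) < 1 - alpha:  # expand until the root is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_prob(mid, m) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def usual_interval(theta_hat, v11, sigma_hat, m, alpha):
    """Endpoints of I = [theta_hat -/+ t(m) * sqrt(v11) * sigma_hat]."""
    half_width = t_quantile(m, alpha) * math.sqrt(v11) * sigma_hat
    return theta_hat - half_width, theta_hat + half_width
```

For example, `usual_interval(1.0, 4.0, 0.5, 10, 0.05)` returns an interval centred at 1.0 whose half-width equals $t(10)$ for $\alpha = 0.05$, because $\sqrt{v_{11}}\,\hat\sigma = 1$ for these hypothetical inputs.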

Kabaila & Giri (2009, Section 3) describe a broad class $\mathcal{D}$ of confidence intervals
that includes $I$. The main result of the present paper, presented in Section 3, is that
$I$ is strongly admissible within the class $\mathcal{D}$. An attractive feature of the proof of this
result is that, although lengthy, it is quite straightforward and elementary.
Section 2 provides a brief description of this class $\mathcal{D}$. For completeness, in Section 4
we describe a strong admissibility result, which follows from the results of Joshi (1969),
for the usual $1-\alpha$ confidence interval for $\theta$ in the somewhat artificial situation that
the error variance $\sigma^2$ is assumed to be known.

2. Description of the class $\mathcal{D}$

Define the parameter $\tau = c^T\beta - t$, where the vector $c$ and the number $t$ are given
and $a$ and $c$ are linearly independent. Let $\hat\tau$ denote $c^T\hat\beta - t$, i.e. the least squares
estimator of $\tau$. Define the matrix $V$ to be the covariance matrix of $(\hat\Theta, \hat\tau)$ divided
by $\sigma^2$. Let $v_{ij}$ denote the $(i,j)$th element of $V$. We use the notation $[a \pm b]$ for the
interval $[a - b, a + b]$ ($b > 0$). Define the following confidence interval for $\theta$

$$J(b,s) = \left[\hat\Theta - \sqrt{v_{11}}\,\hat\sigma\, b\!\left(\frac{\hat\tau}{\hat\sigma\sqrt{v_{22}}}\right) \pm \sqrt{v_{11}}\,\hat\sigma\, s\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right)\right] \qquad (1)$$

where the functions $b$ and $s$ are required to satisfy the following restrictions. The
function $b : \mathbb{R} \to \mathbb{R}$ is an odd function and $s : [0, \infty) \to (0, \infty)$. Both $b$ and $s$ are
bounded. These functions are also continuous except, possibly, at a finite number
of values. Also, $b(x) = 0$ for all $|x| \ge d$ and $s(x) = t(m)$ for all $x \ge d$, where $d$ is
a given positive number. Let $\mathcal{F}(d)$ denote the class of pairs of functions $(b, s)$ that
satisfy these restrictions, for given $d$ ($d > 0$).
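To make the form (1) concrete, here is a minimal Python sketch that evaluates the endpoints of $J(b,s)$ for supplied functions $b$ and $s$. The function names, the example pair `b_example`, `s_example`, the value `D` of $d$, and the approximate value `T_M` of $t(m)$ are all hypothetical choices for illustration, not objects defined in the paper.

```python
import math

def interval_J(theta_hat, tau_hat, sigma_hat, v11, v22, b, s):
    """Endpoints of J(b, s) in (1) for given estimates and functions b, s."""
    x = tau_hat / (sigma_hat * math.sqrt(v22))   # studentized estimate of tau
    scale = math.sqrt(v11) * sigma_hat
    centre = theta_hat - scale * b(x)
    half_width = scale * s(abs(x))
    return centre - half_width, centre + half_width

T_M = 2.228   # assumed approximate t(m) for m = 10 and alpha = 0.05
D = 3.0       # hypothetical value of d

def b_zero(x):
    """b = 0 everywhere; with s_const this recovers the usual interval I."""
    return 0.0

def s_const(x):
    """s = t(m) everywhere."""
    return T_M

def b_example(x):
    """A hypothetical odd, bounded b that vanishes for |x| >= D."""
    return 0.1 * x * (1 - abs(x) / D) if abs(x) < D else 0.0

def s_example(x):
    """A hypothetical bounded, positive s with s(x) = t(m) for x >= D."""
    return T_M - 0.3 * (1 - x / D) if x < D else T_M
```

With `b_zero` and `s_const` the function returns the usual interval $I$. Any pair in $\mathcal{F}(d)$ gives the same endpoints as $I$ whenever $|\hat\tau|/(\hat\sigma\sqrt{v_{22}}) \ge d$, which follows directly from the restrictions on $b$ and $s$.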

Define $\mathcal{D}$ to be the class of all confidence intervals for $\theta$ of the form (1), where
$c$, $t$, $d$, $b$ and $s$ satisfy the stated restrictions. Each member of this class is specified
by $(c, t, d, b, s)$. Apart from the usual $1-\alpha$ confidence interval $I$ for $\theta$, the class $\mathcal{D}$
of confidence intervals for $\theta$ includes the following:

(a) Suppose that we carry out a preliminary hypothesis test of the null hypothesis
$\tau = 0$ against the alternative hypothesis $\tau \neq 0$. Also suppose that we construct
a confidence interval for $\theta$ with nominal coverage $1-\alpha$ based on the assumption
that the selected model had been given to us a priori (as the true model). The
resulting confidence interval, called the naive $1-\alpha$ confidence interval, belongs
to the class $\mathcal{D}$ (Kabaila & Giri, 2009, Section 2).

(b) Confidence intervals for $\theta$ that are constructed to utilize (in the particular
manner described by Kabaila & Giri, 2009) uncertain prior information that
$\tau = 0$.

Let $K$ denote the usual $1-\alpha$ confidence interval for $\theta$ based on the assumption that
$\tau = 0$. The naive $1-\alpha$ confidence interval, described in (a), may be expressed in
the following form:

$$h\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right) I + \left(1 - h\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right)\right) K \qquad (2)$$

where $h : [0, \infty) \to [0, 1]$ is the unit step function defined by $h(x) = 0$ for all $x \in [0, q]$
and $h(x) = 1$ for all $x > q$. Now suppose that we replace $h$ by a continuous increasing
function satisfying $h(0) = 0$ and $h(x) \to 1$ as $x \to \infty$ (a similar construction is
extensively used in the context of point estimation by Saleh, 2006). The confidence
interval (2) is also a member of the class $\mathcal{D}$.
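The combination in (2) can be sketched in Python as follows. This sketch assumes that the weighted sum of two intervals is taken endpoint by endpoint. The functions `step_h`, `smooth_h` and `blend` are hypothetical names, and `smooth_h` is just one possible continuous increasing choice of $h$, not the one used by any cited author.

```python
def step_h(x, q):
    """Unit step: 0 on [0, q], 1 for x > q (the preliminary-test case)."""
    return 1.0 if x > q else 0.0

def smooth_h(x, q):
    """A hypothetical continuous increasing h with h(0) = 0 and h(x) -> 1."""
    return x * x / (q * q + x * x)

def blend(interval_I, interval_K, weight):
    """Endpoint-wise combination weight * I + (1 - weight) * K (assumed reading of (2))."""
    (i_lo, i_hi), (k_lo, k_hi) = interval_I, interval_K
    return (weight * i_lo + (1 - weight) * k_lo,
            weight * i_hi + (1 - weight) * k_hi)
```

With `step_h` the result is exactly $K$ when the test statistic is at most $q$ and exactly $I$ otherwise, while `smooth_h` moves gradually from $K$ towards $I$ as the statistic grows.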
3. Main result

As noted in Section 2, each member of the class $\mathcal{D}$ is specified by $(c, t, d, b, s)$.
The following result states that the usual $1-\alpha$ confidence interval for $\theta$ is strongly
admissible within the class $\mathcal{D}$.

Theorem 1. There does not exist $(c, t, d, b, s) \in \mathcal{D}$ such that the following three
conditions hold:

(a) $E_{\beta,\sigma^2}(\text{length of } J(b,s)) \le E_{\beta,\sigma^2}(\text{length of } I)$ for all $(\beta, \sigma^2)$. (3)

(b) $P_{\beta,\sigma^2}(\theta \in J(b,s)) \ge P_{\beta,\sigma^2}(\theta \in I)$ for all $(\beta, \sigma^2)$. (4)

(c) Strict inequality holds in either (3) or (4) for at least one $(\beta, \sigma^2)$.

The proof of this result is presented in Appendix A.

An illustration of this result is provided by Figure 3 of Kabaila & Giri (2009).
Define $\gamma = \tau/(\sigma\sqrt{v_{22}})$. Also define

$$e(\gamma; s) = \frac{\text{expected length of } J(b,s)}{\text{expected length of } I}.$$

We call this the scaled expected length of $J(b,s)$. Theorem 1 tells us that for any
confidence interval $J(b,s)$, with minimum coverage probability $1-\alpha$, it cannot be
the case that $e(\gamma; s) \le 1$ for all $\gamma$, with strict inequality for at least one $\gamma$. This fact
is illustrated by the bottom panel of Figure 3 of Kabaila & Giri (2009).

Define the class $\tilde{\mathcal{D}}$ to be the subset of $\mathcal{D}$ in which both $b$ and $s$ are continuous
functions. Strong admissibility of the confidence interval $I$ within the class $\mathcal{D}$ implies
weak admissibility of this confidence interval within the class $\tilde{\mathcal{D}}$, as the following
result shows. Since $(\hat\beta, \hat\sigma^2)$ is a sufficient statistic for $(\beta, \sigma^2)$, we reduce the data to
$(\hat\beta, \hat\sigma^2)$.

Corollary 1. There does not exist $(c, t, d, b, s) \in \tilde{\mathcal{D}}$ such that the following three
conditions hold:

(a') $(\text{length of } J(b,s)) \le (\text{length of } I)$ for all $(\hat\beta, \hat\sigma^2)$. (5)

(b') $P_{\beta,\sigma^2}(\theta \in J(b,s)) \ge P_{\beta,\sigma^2}(\theta \in I)$ for all $(\beta, \sigma^2)$. (6)

(c') Strict inequality holds either in (5) for at least one $(\hat\beta, \hat\sigma^2)$ or in (6) for at least one $(\beta, \sigma^2)$.

This corollary is proved in Appendix B.






4. Admissibility result for known error variance 



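In this section, we suppose that $\sigma^2$ is known. Without loss of generality, we
assume that $\sigma^2 = 1$. As before, let $\hat\beta$ denote the least squares estimator of $\beta$.
Since $\hat\beta$ is a sufficient statistic for $\beta$, we reduce the data to $\hat\beta$. Assume that the
parameter of interest is $\theta = \beta_1/\sqrt{\mathrm{Var}(\hat\beta_1)}$. Thus the least squares estimator of $\theta$ is
$\hat\Theta = \hat\beta_1/\sqrt{\mathrm{Var}(\hat\beta_1)}$. Define

$$\delta = \begin{bmatrix} \beta_2 - \ell_2\beta_1 \\ \vdots \\ \beta_p - \ell_p\beta_1 \end{bmatrix}$$

where $\ell_2, \dots, \ell_p$ have been chosen such that $\mathrm{Cov}(\hat\beta_j - \ell_j\hat\beta_1, \hat\beta_1) = 0$ for $j = 2, \dots, p$.
Now define

$$\Delta = \begin{bmatrix} \hat\beta_2 - \ell_2\hat\beta_1 \\ \vdots \\ \hat\beta_p - \ell_p\hat\beta_1 \end{bmatrix}.$$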
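The zero-covariance requirement determines the constants directly: since $\mathrm{Cov}(\hat\beta_j - \ell_j\hat\beta_1, \hat\beta_1) = \mathrm{Cov}(\hat\beta_j, \hat\beta_1) - \ell_j \mathrm{Var}(\hat\beta_1)$, it holds exactly when $\ell_j = \mathrm{Cov}(\hat\beta_j, \hat\beta_1)/\mathrm{Var}(\hat\beta_1)$. A minimal Python sketch of this computation follows; the function name and the covariance matrix are hypothetical and chosen only for illustration.

```python
def decorrelating_constants(cov):
    """Return [l_2, ..., l_p] with l_j = Cov(b_j, b_1) / Var(b_1)."""
    var_1 = cov[0][0]
    return [cov[j][0] / var_1 for j in range(1, len(cov))]

# Hypothetical covariance matrix of the least squares estimator (p = 3).
cov = [[2.0, 0.6, -0.4],
       [0.6, 1.0, 0.1],
       [-0.4, 0.1, 1.5]]
ells = decorrelating_constants(cov)

# Cov(b_j - l_j * b_1, b_1) = Cov(b_j, b_1) - l_j * Var(b_1), which should vanish.
residual_covs = [cov[j][0] - ells[j - 1] * cov[0][0] for j in range(1, len(cov))]
```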

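Note that $(\hat\Theta, \Delta)$ is obtained by a one-to-one transformation from $\hat\beta$. So, we reduce
the data to $(\hat\Theta, \Delta)$. Note that $\hat\Theta$ and $\Delta$ are independent, with $\hat\Theta \sim N(\theta, 1)$ and $\Delta$
with a multivariate normal distribution with mean $\delta$ and known covariance matrix.
Define the number $z$ by the requirement that $P(-z \le Z \le z) = 1-\alpha$ for $Z \sim N(0, 1)$.
Let $I = [\hat\Theta - z, \hat\Theta + z]$. Define

$$\varphi(\hat\theta, \theta) = \begin{cases} 1 & \text{if } \theta \in [\hat\theta - z, \hat\theta + z] \\ 0 & \text{otherwise.} \end{cases}$$

This is the probability that $\theta$ is included in the confidence interval $I$, when $\hat\theta$ is the
observed value of $\hat\Theta$. The length of the confidence interval $I$ is $\int_{-\infty}^{\infty} \varphi(\hat\theta, \theta)\, d\theta = 2z$.
Let $p_\theta(\cdot)$ denote the probability density function of $\hat\Theta$ for given $\theta$. The coverage
probability of $I$ is $\int_{-\infty}^{\infty} \varphi(\hat\theta, \theta)\, p_\theta(\hat\theta)\, d\hat\theta = 1-\alpha$.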
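The two identities just stated (length $2z$ and coverage $1-\alpha$) are easy to check numerically. The Python sketch below uses only `statistics.NormalDist` from the standard library; the function names are assumptions made for this illustration.

```python
from statistics import NormalDist

def known_variance_interval(theta_hat, alpha):
    """Interval [theta_hat - z, theta_hat + z] with P(-z <= Z <= z) = 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z, theta_hat + z

def coverage(theta, alpha):
    """P(theta in I) when the estimator is distributed N(theta, 1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    estimator = NormalDist(mu=theta, sigma=1.0)
    # theta is covered exactly when the estimate lies in [theta - z, theta + z].
    return estimator.cdf(theta + z) - estimator.cdf(theta - z)
```

The coverage returned by `coverage` does not depend on the value of `theta`, which reflects the fact that $\hat\Theta - \theta \sim N(0,1)$ for every $\theta$.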

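Now let $C(\hat\Theta, \Delta)$ denote a confidence set for $\theta$. Define

$$\varphi_\delta(\hat\theta, \theta) = P_{\theta,\delta}\left(\theta \in C(\hat\theta, \Delta)\right),$$

where $\hat\theta$ denotes the observed value of $\hat\Theta$. For each given $\delta \in \mathbb{R}^{p-1}$, the expected
Lebesgue measure of $C(\hat\Theta, \Delta)$ is $E_{\theta,\delta}\left(\int_{-\infty}^{\infty} \varphi_\delta(\hat\Theta, \theta)\, d\theta\right)$. For each given $\delta \in \mathbb{R}^{p-1}$, the
coverage probability of $C(\hat\Theta, \Delta)$ is $\int_{-\infty}^{\infty} \varphi_\delta(\hat\theta, \theta)\, p_\theta(\hat\theta)\, d\hat\theta$. Theorem 5.1 of Joshi (1969)
implies the following strong admissibility result. Suppose that $\varphi_\delta(\hat\theta, \theta)$ satisfies the
following conditions

(i) $E_{\theta,\delta}\left(\int_{-\infty}^{\infty} \varphi_\delta(\hat\Theta, \theta)\, d\theta\right) \le E_{\theta,\delta}\left(\int_{-\infty}^{\infty} \varphi(\hat\Theta, \theta)\, d\theta\right)$ for all $\theta \in \mathbb{R}$.

(ii) $\int_{-\infty}^{\infty} \varphi_\delta(\hat\theta, \theta)\, p_\theta(\hat\theta)\, d\hat\theta \ge \int_{-\infty}^{\infty} \varphi(\hat\theta, \theta)\, p_\theta(\hat\theta)\, d\hat\theta$ for all $\theta \in \mathbb{R}$.

Then $\varphi_\delta(\hat\theta, \theta) = \varphi(\hat\theta, \theta)$ for almost all $(\hat\theta, \theta) \in \mathbb{R}^2$. This result is true for each
$\delta \in \mathbb{R}^{p-1}$. Using standard arguments, this entails that $I \setminus C(\hat\Theta, \Delta)$ and $C(\hat\Theta, \Delta) \setminus I$
are Lebesgue-null sets, for (Lebesgue-) almost all values of $(\hat\Theta, \Delta)$.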



Appendix A: Proof of Theorem 1 



Suppose that c is a given vector (such that c and a are linearly independent), t 
is a given number and d is a given positive number. The proof of Theorem 1 now 
proceeds as follows. We present a few definitions and a lemma. We then apply this 
lemma to prove this theorem. 



Define $W = \hat\sigma/\sigma$. Note that $W$ has the same distribution as $\sqrt{Q/m}$ where
$Q \sim \chi^2_m$. Let $f_W$ denote the probability density function of $W$. Also let $\phi$ denote
the $N(0, 1)$ probability density function. Now define

$$R_1(b, s; \gamma) = \frac{\text{expected length of } J(b,s)}{\text{expected length of } I} - 1.$$

It follows from (7) of Kabaila & Giri (2009) that

$$R_1(b, s; \gamma) = \frac{1}{t(m)E(W)} \int_0^\infty \int_{-d}^{d} \left(s(|x|) - t(m)\right) \phi(wx - \gamma)\, dx\, w^2 f_W(w)\, dw. \qquad (7)$$

Thus, for each $(b, s) \in \mathcal{F}(d)$, $R_1(b, s; \gamma)$ is a continuous function of $\gamma$.

Also define $R_2(b, s; \gamma) = P(\theta \notin J(b,s)) - \alpha$. We make the following definitions,
also used by Kabaila & Giri (2009). Define $\rho = v_{12}/\sqrt{v_{11}v_{22}}$ and $\Psi(x, y; \mu, v) =
P(x \le Z \le y)$, for $Z \sim N(\mu, v)$. Now define the functions

$$k^\dagger(h, w, \gamma, \rho) = \Psi\left(-t(m)w,\ t(m)w;\ \rho(h - \gamma),\ 1 - \rho^2\right)$$

$$k(h, w, \gamma, \rho) = \Psi\left(b(h/w)w - s(|h|/w)w,\ b(h/w)w + s(|h|/w)w;\ \rho(h - \gamma),\ 1 - \rho^2\right).$$

It follows from (6) of Kabaila & Giri (2009) that

$$R_2(b, s; \gamma) = -\int_0^\infty \int_{-d}^{d} \left(k(wx, w, \gamma, \rho) - k^\dagger(wx, w, \gamma, \rho)\right) \phi(wx - \gamma)\, dx\, w f_W(w)\, dw. \qquad (8)$$

Thus, for each $(b, s) \in \mathcal{F}(d)$, $R_2(b, s; \gamma)$ is a continuous function of $\gamma$.
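The density $f_W$ and the moments of $W$ that enter the expressions for $R_1$ and $R_2$ can be checked numerically. The Python sketch below derives $f_W$ from the $\chi^2_m$ density by the change of variable $q = m w^2$ and compares a midpoint-rule value of $E(W)$ with the standard closed form $\sqrt{2/m}\,\Gamma((m+1)/2)/\Gamma(m/2)$. The function names, the integration range and the panel count are assumptions made for this sketch.

```python
import math

def f_W(w, m):
    """Density of W = sqrt(Q / m), where Q ~ chi-squared with m degrees of freedom."""
    if w <= 0:
        return 0.0
    log_chi2 = ((m / 2 - 1) * math.log(m * w * w) - m * w * w / 2
                - (m / 2) * math.log(2) - math.lgamma(m / 2))
    return 2 * m * w * math.exp(log_chi2)   # Jacobian of q = m * w**2

def moment_W(power, m, upper=8.0, panels=8000):
    """E(W**power) by the midpoint rule on [0, upper]."""
    h = upper / panels
    return sum(((i + 0.5) * h) ** power * f_W((i + 0.5) * h, m)
               for i in range(panels)) * h

def mean_W_closed_form(m):
    """E(W) = sqrt(2 / m) * Gamma((m + 1) / 2) / Gamma(m / 2)."""
    return math.sqrt(2 / m) * math.exp(
        math.lgamma((m + 1) / 2) - math.lgamma(m / 2))
```

For instance, `moment_W(2, 10)` should be close to 1, in agreement with $E(W^2) = E(Q)/m = 1$.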
Now $E(W^2) = 1$ and so

$$\int_0^\infty w^2 f_W(w)\, dw = 1.$$

It follows from (7) that

$$\int_{-\infty}^{\infty} R_1(b, s; \gamma)\, d\gamma = \frac{2}{t(m)E(W)} \int_0^d \left(s(x) - t(m)\right) dx. \qquad (9)$$

Thus $\int_{-\infty}^{\infty} R_1(b, s; \gamma)\, d\gamma$ exists for all $(b, s) \in \mathcal{F}(d)$.

Since $k(wx, w, \gamma, \rho)$ and $k^\dagger(wx, w, \gamma, \rho)$ are probabilities,

$$|R_2(b, s; \gamma)| \le \int_0^\infty \int_{-d}^{d} \phi(wx - \gamma)\, dx\, w f_W(w)\, dw,$$

so that

$$\int_{-\infty}^{\infty} |R_2(b, s; \gamma)|\, d\gamma \le 2d \int_0^\infty w f_W(w)\, dw = 2d\, E(W) < \infty.$$

Thus $\int_{-\infty}^{\infty} R_2(b, s; \gamma)\, d\gamma$ exists for all $(b, s) \in \mathcal{F}(d)$.
Thus, we may define

$$g(b, s; \lambda) = \lambda \int_{-\infty}^{\infty} R_1(b, s; \gamma)\, d\gamma + (1 - \lambda) \int_{-\infty}^{\infty} R_2(b, s; \gamma)\, d\gamma,$$

for each $(b, s) \in \mathcal{F}(d)$, where $0 < \lambda < 1$. Kempthorne (1983, 1987, 1988) presents
results on what he calls compromise decision theory. Initially, these results were
applied only to the solution of some problems of point estimation. Kabaila & Tuck
(2008) develop new results in compromise decision theory and apply these to a
problem of interval estimation. The following lemma, which will be used in the
proof of Theorem 1, is in the style of these compromise decision theory results.

Lemma 1. Suppose that $c$ is a given vector (such that $c$ and $a$ are linearly independent),
$t$ is a given number and $d$ is a given positive number. Also suppose that $\lambda$ is
given and that $(b^*, s^*)$ minimizes $g(b, s; \lambda)$ with respect to $(b, s) \in \mathcal{F}(d)$. Then there
does not exist $(b, s) \in \mathcal{F}(d)$ such that

(a) $R_1(b, s; \gamma) \le R_1(b^*, s^*; \gamma)$ for all $\gamma$.

(b) $R_2(b, s; \gamma) \le R_2(b^*, s^*; \gamma)$ for all $\gamma$.

(c) Strict inequality holds in either (a) or (b) for at least one $\gamma$.

Proof. Suppose that $c$ is a given vector (such that $c$ and $a$ are linearly independent),
$t$ is a given number and $d$ is a given positive number. The proof is by contradiction.
Suppose that there exists $(b, s) \in \mathcal{F}(d)$ such that (a), (b) and (c) hold. Now,

$$g(b^*, s^*; \lambda) - g(b, s; \lambda) = \lambda \int_{-\infty}^{\infty} \left(R_1(b^*, s^*; \gamma) - R_1(b, s; \gamma)\right) d\gamma + (1 - \lambda) \int_{-\infty}^{\infty} \left(R_2(b^*, s^*; \gamma) - R_2(b, s; \gamma)\right) d\gamma.$$

By hypothesis, one of the following 2 cases holds.

Case 1: (a) and (b) hold and $R_1(b^*, s^*; \gamma) - R_1(b, s; \gamma) > 0$ for at least one $\gamma$. Since
$R_1(b^*, s^*; \gamma) - R_1(b, s; \gamma)$ is a continuous function of $\gamma$,

$$\int_{-\infty}^{\infty} \left(R_1(b^*, s^*; \gamma) - R_1(b, s; \gamma)\right) d\gamma > 0.$$

Thus $g(b^*, s^*; \lambda) > g(b, s; \lambda)$ and we have established a contradiction.

Case 2: (a) and (b) hold and $R_2(b^*, s^*; \gamma) - R_2(b, s; \gamma) > 0$ for at least one $\gamma$. Since
$R_2(b^*, s^*; \gamma) - R_2(b, s; \gamma)$ is a continuous function of $\gamma$,

$$\int_{-\infty}^{\infty} \left(R_2(b^*, s^*; \gamma) - R_2(b, s; \gamma)\right) d\gamma > 0.$$

Thus $g(b^*, s^*; \lambda) > g(b, s; \lambda)$ and we have established a contradiction.

Lemma 1 follows from the fact that this argument holds for every given vector $c$
(such that $c$ and $a$ are linearly independent), every given number $t$ and every given
positive number $d$. □

We will first find the $(b^*, s^*)$ that minimizes $g(b, s; \lambda)$ with respect to $(b, s) \in
\mathcal{F}(d)$, for given $\lambda$. We will then choose $\lambda$ such that $J(b^*, s^*) = I$, the usual $1-\alpha$
confidence interval for $\theta$. Theorem 1 is then a consequence of Lemma 1.

By changing the variable of integration in the inner integral in (8), it can be
shown that $R_2(b, s; \gamma)$ is equal to



$$-\int_0^\infty \int_0^d \Big( \left(k(wx, w, \gamma, \rho) - k^\dagger(wx, w, \gamma, \rho)\right) \phi(wx - \gamma) + \left(k(-wx, w, \gamma, \rho) - k^\dagger(-wx, w, \gamma, \rho)\right) \phi(wx + \gamma) \Big)\, dx\, w f_W(w)\, dw.$$

Using this expression and the restriction that $b$ is an odd function, we find that
$\int_{-\infty}^{\infty} R_2(b, s; \gamma)\, d\gamma$ is equal to

$$-\int_0^d \int_0^\infty \int_{-\infty}^{\infty} \Big( \Psi\left(b(x)w - s(x)w,\ b(x)w + s(x)w;\ \rho y,\ 1 - \rho^2\right) - \Psi\left(-t(m)w,\ t(m)w;\ \rho y,\ 1 - \rho^2\right)$$
$$\qquad + \Psi\left(-b(x)w - s(x)w,\ -b(x)w + s(x)w;\ -\rho y,\ 1 - \rho^2\right) - \Psi\left(-t(m)w,\ t(m)w;\ -\rho y,\ 1 - \rho^2\right) \Big)\, \phi(y)\, dy\, w f_W(w)\, dw\, dx.$$

Hence, to within an additive constant that does not depend on $(b, s)$, $\int_{-\infty}^{\infty} R_2(b, s; \gamma)\, d\gamma$
is equal to

$$-\int_0^d \int_0^\infty \int_{-\infty}^{\infty} \Big( \Psi\left(b(x)w - s(x)w,\ b(x)w + s(x)w;\ \rho y,\ 1 - \rho^2\right) + \Psi\left(-b(x)w - s(x)w,\ -b(x)w + s(x)w;\ -\rho y,\ 1 - \rho^2\right) \Big)\, \phi(y)\, dy\, w f_W(w)\, dw\, dx.$$

Thus, to within an additive constant that does not depend on $(b, s)$,

$$g(b, s; \lambda) = \int_0^d q(b, s; x)\, dx,$$

where $q(b, s; x)$ is equal to

$$\frac{2\lambda}{t(m)E(W)}\, s(x) - (1 - \lambda) \int_0^\infty \int_{-\infty}^{\infty} \Big( \Psi\left(b(x)w - s(x)w,\ b(x)w + s(x)w;\ \rho y,\ 1 - \rho^2\right) + \Psi\left(-b(x)w - s(x)w,\ -b(x)w + s(x)w;\ -\rho y,\ 1 - \rho^2\right) \Big)\, \phi(y)\, dy\, w f_W(w)\, dw.$$

Note that $x$ enters into the expression for $q(b, s; x)$ only through $b(x)$ and $s(x)$. To
minimize $g(b, s; \lambda)$ with respect to $(b, s) \in \mathcal{F}(d)$, it is therefore sufficient to minimize
$q(b, s; x)$ with respect to $(b(x), s(x))$ for each $x \in [0, d]$. The situation here is similar
to the computation of Bayes rules, see e.g. Casella & Berger (2002, pp. 352-353).
Therefore, to minimize $g(b, s; \lambda)$ with respect to $(b, s) \in \mathcal{F}(d)$, we simply minimize

$$\tilde q(b, s) = \frac{2\lambda}{t(m)E(W)}\, s - (1 - \lambda) \int_0^\infty \int_{-\infty}^{\infty} \Big( \Psi\left(bw - sw,\ bw + sw;\ \rho y,\ 1 - \rho^2\right) + \Psi\left(-bw - sw,\ -bw + sw;\ -\rho y,\ 1 - \rho^2\right) \Big)\, \phi(y)\, dy\, w f_W(w)\, dw$$

with respect to $(b, s) \in \mathbb{R} \times (0, \infty)$, to obtain $(b', s')$ and then set $b(x) = b'$ and
$s(x) = s'$ for all $x \in [0, d]$.
Let the random variables $A$ and $B$ have the following distribution

$$\begin{bmatrix} A \\ B \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right).$$

Note that the distribution of $A$, conditional on $B = y$, is $N(\rho y, 1 - \rho^2)$. Thus

$$\Psi\left(bw - sw,\ bw + sw;\ \rho y,\ 1 - \rho^2\right) = P\left(bw - sw \le A \le bw + sw \mid B = y\right).$$

Hence

$$\int_0^\infty \int_{-\infty}^{\infty} \Psi\left(bw - sw,\ bw + sw;\ \rho y,\ 1 - \rho^2\right) \phi(y)\, dy\, w f_W(w)\, dw = \int_0^\infty P\left(bw - sw \le A \le bw + sw\right) w f_W(w)\, dw. \qquad (10)$$

Let $\Phi$ denote the $N(0, 1)$ cumulative distribution function. For every fixed $w > 0$
and $s > 0$,

$$P\left(bw - sw \le A \le bw + sw\right) = \Phi(bw + sw) - \Phi(bw - sw)$$

is maximized by setting $b = 0$. Thus, for each fixed $s > 0$, (10) is maximized with
respect to $b \in \mathbb{R}$ by setting $b = 0$.
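The claim that $\Phi(bw + sw) - \Phi(bw - sw)$ is largest at $b = 0$ reflects the symmetry and unimodality of the standard normal density: an interval of fixed length $2sw$ captures the most probability when it is centred at zero. A quick numerical check in Python is sketched below; the grid and the function names are arbitrary choices made for this illustration.

```python
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal cumulative distribution function

def inclusion_prob(b, s, w):
    """P(bw - sw <= A <= bw + sw) for A ~ N(0, 1)."""
    return PHI(b * w + s * w) - PHI(b * w - s * w)

def best_b_on_grid(s, w, grid):
    """Grid point b that maximizes the inclusion probability."""
    return max(grid, key=lambda b: inclusion_prob(b, s, w))

grid = [k / 100 for k in range(-300, 301)]   # b in [-3, 3] with step 0.01
```

Calling `best_b_on_grid(s, w, grid)` for any fixed positive `s` and `w` should return `0.0`.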

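Now let the random variables $A$ and $B$ have the following distribution

$$\begin{bmatrix} A \\ B \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & -\rho \\ -\rho & 1 \end{bmatrix} \right).$$

Note that the distribution of $A$, conditional on $B = y$, is $N(-\rho y, 1 - \rho^2)$. Thus

$$\Psi\left(-bw - sw,\ -bw + sw;\ -\rho y,\ 1 - \rho^2\right) = P\left(-bw - sw \le A \le -bw + sw \mid B = y\right).$$

Hence

$$\int_0^\infty \int_{-\infty}^{\infty} \Psi\left(-bw - sw,\ -bw + sw;\ -\rho y,\ 1 - \rho^2\right) \phi(y)\, dy\, w f_W(w)\, dw = \int_0^\infty P\left(-bw - sw \le A \le -bw + sw\right) w f_W(w)\, dw. \qquad (11)$$

For every fixed $w > 0$ and $s > 0$,

$$P\left(-bw - sw \le A \le -bw + sw\right) = \Phi(-bw + sw) - \Phi(-bw - sw)$$

is maximized by setting $b = 0$. Thus, for each fixed $s > 0$, (11) is maximized with
respect to $b \in \mathbb{R}$ by setting $b = 0$.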
Therefore, $\tilde q(b, s)$ is, for each fixed $s > 0$, minimized with respect to $b$ by setting
$b = 0$. Thus $b' = 0$ and so $b^*(x) = 0$ for all $x \in \mathbb{R}$. Hence, to find $s'$ we need to
minimize

$$\frac{\lambda}{t(m)E(W)}\, s - (1 - \lambda) \int_0^\infty \left(2\Phi(sw) - 1\right) w f_W(w)\, dw$$

with respect to $s > 0$. Therefore, to find $s'$ we may minimize

$$r(s) = \ell(\lambda)\, s - 2 \int_0^\infty \Phi(sw)\, w f_W(w)\, dw$$

with respect to $s > 0$, where

$$\ell(\lambda) = \frac{\lambda}{(1 - \lambda)\, t(m) E(W)}.$$

Note that $\ell(\lambda)$ is an increasing function of $\lambda$, such that $\ell(\lambda) \downarrow 0$ as $\lambda \downarrow 0$ and
$\ell(\lambda) \uparrow \infty$ as $\lambda \uparrow 1$. Choose $\lambda = \lambda^*$, where

$$\ell(\lambda^*) = 2 \int_0^\infty \phi(t(m)w)\, w^2 f_W(w)\, dw.$$

Note that $0 < \ell(\lambda^*) < \sqrt{2/\pi}$. Now

$$\frac{dr(s)}{ds} = \ell(\lambda^*) - 2 \int_0^\infty \phi(sw)\, w^2 f_W(w)\, dw.$$

Since $\int_0^\infty \phi(sw)\, w^2 f_W(w)\, dw$ is a decreasing function of $s > 0$, $dr(s)/ds$ is an
increasing function of $s > 0$. Also, for $s = 0$, $\int_0^\infty \phi(sw)\, w^2 f_W(w)\, dw = 1/\sqrt{2\pi}$. Thus,
to minimize $r(s)$ with respect to $s > 0$, we need to solve

$$\ell(\lambda^*) - 2 \int_0^\infty \phi(sw)\, w^2 f_W(w)\, dw = 0$$

for $s > 0$. By the definition of $\ell(\lambda^*)$, the solution is $s = t(m)$. Thus $s^*(x) = t(m)$ for all $x \ge 0$.
In other words, $J(b^*, s^*) = I$. By Lemma 1, there does not exist $(b, s) \in \mathcal{F}(d)$ such
that

(a) $E_{\beta,\sigma^2}(\text{length of } J(b,s)) \le E_{\beta,\sigma^2}(\text{length of } I)$ for all $(\beta, \sigma^2)$. (12)

(b) $P_{\beta,\sigma^2}(\theta \in J(b,s)) \ge P_{\beta,\sigma^2}(\theta \in I)$ for all $(\beta, \sigma^2)$. (13)

(c) Strict inequality holds in either (12) or (13) for at least one $(\beta, \sigma^2)$.

Theorem 1 follows from the fact that this argument holds for every given vector $c$
(such that $c$ and $a$ are linearly independent), every given number $t$ and every given
positive number $d$.
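The choice of $\lambda^*$ in the proof can be traced numerically. The Python sketch below evaluates $\ell(\lambda^*)$, inverts $\ell(\lambda) = \lambda/((1-\lambda)\,t(m)E(W))$ to recover $\lambda^*$, and evaluates $dr(s)/ds$, which should change sign from negative to positive at $s = t(m)$. The value of $t(m)$ is passed in as a number. The function names, the midpoint-rule integration and its settings are assumptions made for this sketch.

```python
import math

def f_W(w, m):
    """Density of W = sqrt(Q / m), where Q ~ chi-squared with m degrees of freedom."""
    if w <= 0:
        return 0.0
    log_chi2 = ((m / 2 - 1) * math.log(m * w * w) - m * w * w / 2
                - (m / 2) * math.log(2) - math.lgamma(m / 2))
    return 2 * m * w * math.exp(log_chi2)

def phi(x):
    """Standard normal probability density function."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integral_W(g, m, upper=8.0, panels=8000):
    """Midpoint-rule value of the integral of g(w) * f_W(w) over (0, upper)."""
    h = upper / panels
    return sum(g((i + 0.5) * h) * f_W((i + 0.5) * h, m)
               for i in range(panels)) * h

def ell_star(t_m, m):
    """ell(lambda*) = 2 * integral of phi(t(m) w) w^2 f_W(w) dw."""
    return 2 * integral_W(lambda w: phi(t_m * w) * w * w, m)

def lambda_star(t_m, m):
    """Invert ell(lambda) = lambda / ((1 - lambda) * t(m) * E(W))."""
    c = ell_star(t_m, m) * t_m * integral_W(lambda w: w, m)
    return c / (1 + c)

def dr_ds(s, t_m, m):
    """Derivative of r(s) when lambda = lambda*."""
    return ell_star(t_m, m) - 2 * integral_W(lambda w: phi(s * w) * w * w, m)
```

For example, with $m = 10$ and the approximate tabulated value $t(m) \approx 2.228$ for $\alpha = 0.05$, `dr_ds(1.5, 2.228, 10)` should be negative and `dr_ds(3.0, 2.228, 10)` positive, consistent with $r$ being minimized at $s = t(m)$.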
Appendix B: Proof of Corollary 1

The proof of Corollary 1 is by contradiction. Suppose that $c$ is a given vector
(such that $c$ and $a$ are linearly independent), $t$ is a given number and $d$ is a given
positive number. Also suppose that there exists $(b, s) \in \mathcal{F}(d)$ such that both $b$ and
$s$ are continuous and (a'), (b') and (c'), in the statement of Corollary 1, hold. Now
(a') implies that

$$E_{\beta,\sigma^2}(\text{length of } J(b,s)) \le E_{\beta,\sigma^2}(\text{length of } I) \text{ for all } (\beta, \sigma^2),$$

so that (a) holds. By hypothesis, one of the following two cases holds.

Case 1: $(\text{length of } J(b,s)) < (\text{length of } I)$ for at least one $(\hat\beta, \hat\sigma^2)$. Now

$$(\text{length of } J(b,s)) = 2\sqrt{v_{11}}\,\hat\sigma\, s\!\left(\frac{|\hat\tau|}{\hat\sigma\sqrt{v_{22}}}\right),$$

which is a continuous function of $(\hat\beta, \hat\sigma^2)$. Hence $(\text{length of } I) - (\text{length of } J(b,s))$
is a continuous function of $(\hat\beta, \hat\sigma^2)$. This difference is nonnegative everywhere, by (a'), and
strictly positive on an open set, which has positive probability under every $(\beta, \sigma^2)$. Thus

$$E_{\beta,\sigma^2}(\text{length of } J(b,s)) < E_{\beta,\sigma^2}(\text{length of } I) \text{ for at least one } (\beta, \sigma^2).$$

Thus there exists $(b, s) \in \mathcal{F}(d)$ such that (a), (b) and (c), in the statement of
Theorem 1, hold. We have established a contradiction.

Case 2: There is strict inequality in (b') for at least one $(\beta, \sigma^2)$. Thus there exists
$(b, s) \in \mathcal{F}(d)$ such that (a), (b) and (c), in the statement of Theorem 1, hold. We
have established a contradiction.

Corollary 1 follows from the fact that this argument holds for every given vector $c$
(such that $c$ and $a$ are linearly independent), every given number $t$ and every given
positive number $d$.